NSF PAR Search | NSF Public Access Repository

Do Multi-Document Summarization Models Synthesize?

https://doi.org/10.1162/tacl_a_00687

DeYoung, Jay; Martinez, Stephanie C; Marshall, Iain J; Wallace, Byron C (September 2024, Transactions of the Association for Computational Linguistics)
Louis, Annie (Ed.)
Abstract: Multi-document summarization entails producing concise synopses of collections of inputs. For some applications, the synopsis should accurately synthesize inputs with respect to a key aspect, e.g., a synopsis of film reviews written about a particular movie should reflect the average critic consensus. As a more consequential example, narrative summaries that accompany biomedical systematic reviews of clinical trial results should accurately summarize the potentially conflicting results from individual trials. In this paper we ask: To what extent do modern multi-document summarization models implicitly perform this sort of synthesis? We run experiments over opinion and evidence synthesis datasets using a suite of summarization models, from fine-tuned transformers to GPT-4. We find that existing models partially perform synthesis, but imperfectly: Even the best performing models are over-sensitive to changes in input ordering and under-sensitive to changes in input compositions (e.g., ratio of positive to negative reviews). We propose a simple, general, effective method for improving model synthesis capabilities by generating an explicitly diverse set of candidate outputs, and then selecting from these the string best aligned with the expected aggregate measure for the inputs, or abstaining when the model produces no good candidate.
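The select-or-abstain step described at the end of the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function `select_or_abstain`, the word-list scorer `toy_sentiment`, the `tolerance` threshold, and the sample data. The paper's actual candidate generation, scoring models, and abstention rule may differ.

```python
from typing import Callable, Optional, Sequence

# Hypothetical word lists for a toy sentiment scorer (illustration only).
POSITIVE = {"great", "good", "excellent"}
NEGATIVE = {"bad", "poor", "dull"}


def toy_sentiment(text: str) -> float:
    """Score text in [0, 1] as the share of positive words among sentiment words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.5  # no sentiment words found: treat as neutral
    return pos / (pos + neg)


def select_or_abstain(
    candidates: Sequence[str],
    score_fn: Callable[[str], float],
    target: float,
    tolerance: float,
) -> Optional[str]:
    """Return the candidate whose score is closest to `target`.

    Returns None (abstains) when there are no candidates or when even the
    closest candidate is more than `tolerance` away from the target.
    """
    best: Optional[str] = None
    best_gap: Optional[float] = None
    for text in candidates:
        gap = abs(score_fn(text) - target)
        if best_gap is None or gap < best_gap:
            best, best_gap = text, gap
    if best_gap is None or best_gap > tolerance:
        return None
    return best


# Assumed per-review sentiment scores; the target is their mean.
input_scores = [0.8, 0.2, 0.3]
target = sum(input_scores) / len(input_scores)

# Assumed candidate summaries, standing in for diverse model outputs.
candidates = [
    "great good excellent film",
    "good acting but dull and poor plot",
    "bad dull poor",
]

chosen = select_or_abstain(candidates, toy_sentiment, target, tolerance=0.15)
print(chosen)
```

With a looser `tolerance` the function always returns some candidate; tightening it trades coverage for closer agreement with the aggregate of the inputs.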
Full Text Available
Automatically Extracting Numerical Results from RCTs with LLMs

Yun, Hye Sun; Pogrebitskiy, David; Marshall, Iain J; Wallace, Byron C (August 2024, Machine Learning for Healthcare (MLHC))

Full Text Available
Evidence Inference 2.0: More Data, Better Models

DeYoung, Jay; Lehman, Eric; Nye, Ben; Marshall, Iain J.; Wallace, Byron C. (July 2020, BioNLP: Workshop on Biomedical Natural Language Processing)

Full Text Available
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time

Nye, Benjamin E.; Nenkova, Ani; Marshall, Iain J.; Wallace, Byron C. (January 2020, Proceedings of the Association for Computational Linguistics (ACL))

Full Text Available
Trialstreamer: A living, automatically updated database of clinical trial reports

https://doi.org/10.1093/jamia/ocaa163

Marshall, Iain J; Nye, Benjamin; Kuiper, Joël; Noel-Storr, Anna; Marshall, Rachel; Maclean, Rory; Soboczenski, Frank; Nenkova, Ani; Thomas, James; Wallace, Byron C (September 2020, Journal of the American Medical Informatics Association)
Abstract
Objective: Randomized controlled trials (RCTs) are the gold standard method for evaluating whether a treatment works in health care but can be difficult to find and make use of. We describe the development and evaluation of a system to automatically find and categorize all new RCT reports.
Materials and Methods: Trialstreamer continuously monitors PubMed and the World Health Organization International Clinical Trials Registry Platform, looking for new RCTs in humans using a validated classifier. We combine machine learning and rule-based methods to extract information from the RCT abstracts, including free-text descriptions of trial PICO (populations, interventions/comparators, and outcomes) elements and map these snippets to normalized MeSH (Medical Subject Headings) vocabulary terms. We additionally identify sample sizes, predict the risk of bias, and extract text conveying key findings. We store all extracted data in a database, which we make freely available for download, and via a search portal, which allows users to enter structured clinical queries. Results are ranked automatically to prioritize larger and higher-quality studies.
Results: As of early June 2020, we have indexed 673 191 publications of RCTs, of which 22 363 were published in the first 5 months of 2020 (142 per day). We additionally include 304 111 trial registrations from the International Clinical Trials Registry Platform. The median trial sample size was 66.
Conclusions: We present an automated system for finding and categorizing RCTs. This yields a novel resource: a database of structured information automatically extracted for all published RCTs in humans. We make daily updates of this database available on our website (https://trialstreamer.robotreviewer.net).
Full Text Available